AITopics | Paramaribo

Large language models (LLMs) like transformers have impressive in-context learning (ICL) capabilities; they can generate predictions for new queries based on input-output sequences in prompts without parameter updates. While many theories have attempted to explain ICL, they often focus on structured training data similar to ICL tasks, such as regression. In practice, however, these models are trained in an unsupervised manner on unstructured text data, which bears little resemblance to ICL tasks. To this end, we investigate how ICL emerges from unsupervised training on unstructured data. The key observation is that ICL can arise simply by modeling co-occurrence information using classical language models like continuous bag of words (CBOW), which we theoretically prove and empirically validate. Furthermore, we establish the necessity of positional information and noise structure to generalize ICL to unseen data. Finally, we present instances where ICL fails and provide theoretical explanations; they suggest that the ICL ability of LLMs to identify certain tasks can be sensitive to the structure of the training data.

icl, in-context learning, scenario

arXiv.org Machine Learning

2406.00131

Country:

South America > Suriname > Paramaribo District > Paramaribo (0.04)
North America > United States > Michigan (0.04)
Europe > Liechtenstein (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards a general purpose machine translation system for Sranantongo

Zwennicker, Just; Stap, David

arXiv.org Artificial Intelligence, Dec-13-2022

Machine translation for Sranantongo (Sranan, srn), a low-resource Creole language spoken predominantly in Suriname, is virgin territory. In this study we create a general-purpose machine translation system for srn. To facilitate this research, we introduce the SRNcorpus, a collection of parallel Dutch (nl) to srn data and monolingual srn data. We experiment with a wide range of proven machine translation methods. Our results demonstrate a strong baseline machine translation system for srn.

artificial intelligence, natural language, translation system

arXiv.org Artificial Intelligence

2212.06383

Country:

South America > Suriname > Paramaribo District > Paramaribo (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.05)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Asia > Thailand > Phuket > Phuket (0.05)

Genre: Research Report > New Finding (0.91)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
